IN THE CLAIMS 



1. (Currently Amended) An apparatus, in an integrated circuit (IC) of a data processing
system having at least one host processor and host memory, comprising: 
a chip interconnect; 

a host interface coupled to the chip interconnect for interfacing the IC with the at least 
one host processor external to the IC; 

a memory interface coupled to the chip interconnect for interfacing the IC with the 
host memory external to the IC; 

a memory controller coupled to the chip interconnect for controlling the host memory
comprising DRAM memory via the memory interface, the memory controller
coupled to the chip interconnect;

a scalar processing unit coupled to the chip interconnect, the scalar processing unit 
being capable of executing instructions to perform scalar data processing; 

a vector processing unit coupled to the chip interconnect, the vector processing unit 
being capable of executing instructions to perform vector data processing; and 

an input and output (I/O) interface coupled to the chip interconnect, the I/O interface
receiving/transmitting data from/to the scalar and/or vector processing units for
interfacing the IC with an I/O controller of the data processing system, the I/O
controller being external to the IC for controlling I/O devices of the data
processing system, wherein the chip interconnect, the memory controller, the
scalar processing unit, the vector processing unit, the I/O interface, the host
interface, and the memory interface are implemented within the IC which is a
single chipset interfacing the at least one host processor and the host memory
with other components of the data processing system, including the I/O
controller and the I/O devices.

2. (Previously Presented) The apparatus of claim 1, further comprising a switch mechanism 
coupled to the chip interconnect and coupled to the scalar processing unit and coupled to the
vector processing unit, the switch mechanism operable to receive multiple media data streams 
from the I/O interface and dispatch the multiple media data streams to the scalar processing 
unit and/or the vector processing unit. 

3. (Previously Presented) The apparatus of claim 1, further comprising: 

multiple scalar processing units, the multiple scalar processing units being capable of 
executing instructions to perform scalar processing substantially 
simultaneously; and 

multiple vector processing units, the multiple vector processing units being capable of 
executing instructions to perform vector processing substantially 
simultaneously. 

4. (Original) The apparatus of claim 3, further comprising multiple scalar processing 
units of a kind and multiple vector processing units of a kind.

5. (Original) The apparatus of claim 1, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 
a vector register (VR) file coupled to the vector processing unit; and 
a load and store unit (LSU), the LSU being capable of executing instructions to load 
and store scalar data from and to the GPR, and the LSU being capable of 
executing instructions to load and store vector data from and to the VR. 

6. (Original) The apparatus of claim 5, further comprising a memory location coupled to
the chip interconnect, wherein the LSU loads and stores data from and to the memory 
location. 

7. (Original) The apparatus of claim 6, further comprising a direct memory access (DMA) 
engine, the DMA engine transferring the multiple media data between the memory location 
and the host memory. 

8. (Original) The apparatus of claim 5, wherein the LSU is capable of executing 
instructions to load and store various formats of scalar and vector data, wherein the various 
formats comprise 8-bit, 16-bit, and 32-bit formats. 

9. (Previously Presented) The apparatus of claim 1, wherein the switch mechanism 
comprises an instruction unit (IUNIT), the IUNIT controlling and dispatching instructions 
substantially simultaneously. 

10. (Original) The apparatus of claim 9, wherein the instructions comprise very long 
instruction word (VLIW) instructions. 

11. (Original) The apparatus of claim 9, wherein the IUNIT further comprises:

a program counter;

a branch unit, wherein the program counter and the branch unit determine the location
to fetch next instructions;

an instruction cache memory, the instruction cache memory comprising instruction
cache tag and data memories for buffering instructions transmitted from the
host memory; and

at least one memory mapped register accessible by the host.

12. (Original) The apparatus of claim 1, wherein the scalar processing unit comprises:

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;
and

an integer shift unit (ISHU), the ISHU being capable of executing instructions to
perform scalar bit shifting and rotating operations.

13. (Original) The apparatus of claim 12, wherein the scalar processing unit further 
comprises a floating point unit (FPU), the FPU being capable of executing instructions to 
perform high precision scalar data processing. 

14. (Original) The apparatus of claim 1, wherein the vector processing unit comprises: 

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations; and

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up.

15. (Original) The apparatus of claim 14, wherein the vector processing unit further 
comprises a vector floating point unit (VFPU), the VFPU being capable of executing 
instructions to perform high precision vector data processing. 

16. (Original) The apparatus of claim 14, wherein the VLUT comprises a memory location
storing at least one look-up table (LUT). 

17. (Original) The apparatus of claim 16, wherein data of the LUT are transferred from the 
host memory to the memory location through a direct memory access (DMA) operation. 

18. (Original) The apparatus of claim 16, wherein the memory location comprises a static
random access memory (SRAM). 
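
By way of illustration only, and without limiting claims 14 and 16 through 18, a vector table look-up may be sketched in Python as below. The function name `vector_table_lookup`, the 16-entry table of squares, and the index values are assumptions chosen for the example; none of them appears in the claims.

```python
from typing import List, Sequence


def vector_table_lookup(lut: Sequence[int], indices: Sequence[int]) -> List[int]:
    """Return lut[i] for every index i in the input vector (hypothetical helper)."""
    size = len(lut)
    for i in indices:
        # Reject out-of-range indices instead of silently wrapping around.
        if not 0 <= i < size:
            raise IndexError(f"index {i} is outside the {size}-entry table")
    return [lut[i] for i in indices]


# Assumed example table: squares of 0..15, standing in for a LUT held in a memory location.
SQUARES_LUT = [n * n for n in range(16)]

looked_up = vector_table_lookup(SQUARES_LUT, [0, 3, 15])
```

Each element of the input vector is treated as an index, so a single look-up replaces one arithmetic operation per element.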

19. (Original) The apparatus of claim 1, wherein the scalar and vector processing units are 
capable of performing data processing autonomously and asynchronously to the host 
processor. 

20. (Original) The apparatus of claim 1, wherein the scalar and vector processing units 
communicate with the host processor through an interrupt mechanism.

21. (Original) The apparatus of claim 1, wherein the scalar and vector processing units are
accessible by the host processor through a set of memory mapped addresses.

22. (Original) The apparatus of claim 1, wherein the IC may be a co-processor to the host, 
wherein the IC may be a stand-alone processor coupled to a bus of the data processing 
system, and wherein the chipset may be a core logic chip having a host interface coupled 
to the host processor and memory interface coupled to the host memory. 

23. (Original) The apparatus of claim 5, further comprising a special purpose register (SPR)
file coupled to the chip interconnect. 

24. (Currently Amended) A method, in an integrated circuit (IC) having a chip interconnect,
of a data processing system having at least one host processor and a host memory, the method 
comprising: 

receiving a data stream from an input/output (I/O) interface coupled to the chip
interconnect, the I/O interface capable of being coupled to an I/O controller
external to the IC, the I/O controller controlling I/O devices of the data
processing system external to the IC;

examining data of the data stream to determine whether the data requires scalar data 
processing or vector data processing; 

performing scalar data processing on the data in the IC, if the data requires scalar data 
processing; and 

performing vector data processing on the data in the IC, if the data requires vector data 
processing, wherein receiving the data stream, examining the data, the scalar 
data processing, and the vector data processing are performed within the IC 
which is a single chipset interfacing the at least one host processor and the host 
memory with other components of the data processing system including the I/O 
controller and the I/O devices, wherein the at least one host processor, the host 
memory, the I/O controller, and the I/O devices are external to the IC, and 
wherein the I/O controller and the I/O devices communicate with the host 
processor and the host memory via the IC.

25. (Original) The method of claim 24, further comprising: 

a scalar processing unit coupled to the chip interconnect, the scalar processing unit
performing scalar data processing on the data; and

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data.

26. (Previously Presented) The method of claim 25, further comprising multiple scalar
processing units performing scalar data processing substantially simultaneously and multiple 
vector processing units performing vector data processing substantially simultaneously. 

27. (Previously Presented) The method of claim 25, further comprising: 

dispatching the data to the scalar processing unit if the data requires scalar data 
processing; and 

dispatching the data to the vector processing unit if the data requires vector data 
processing. 

28. (Original) The method of claim 27, wherein the dispatching is performed by a switch 
mechanism coupled to the chip interconnect, the switch mechanism receiving the data from 
the I/O interface. 

29. (Original) The method of claim 28, wherein the switch mechanism comprises an 
instruction unit (IUNIT), the IUNIT being capable of decoding the data. 
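
By way of illustration only, the examining and dispatching steps of claims 24 and 27 through 29 might be sketched in Python as follows. The `Packet` type, the rule that a multi-element payload requires vector processing, and the add-one operations are assumptions made for the example; the claims do not specify how the data is examined or what processing is applied.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence, Union

Payload = Union[int, Sequence[int]]


@dataclass
class Packet:
    """Hypothetical unit of the data stream received from the I/O interface."""
    payload: Payload


def requires_vector_processing(packet: Packet) -> bool:
    # Assumed examination rule: a single integer is scalar data, anything else is vector data.
    return not isinstance(packet.payload, int)


def scalar_process(value: int) -> int:
    # Stand-in for scalar data processing.
    return value + 1


def vector_process(values: Sequence[int]) -> List[int]:
    # Stand-in for vector data processing: the same operation applied to every element.
    return [v + 1 for v in values]


def dispatch(stream: Iterable[Packet]) -> List[Union[int, List[int]]]:
    """Examine each packet and route it to the scalar or the vector path."""
    results: List[Union[int, List[int]]] = []
    for packet in stream:
        if requires_vector_processing(packet):
            results.append(vector_process(packet.payload))
        else:
            results.append(scalar_process(packet.payload))
    return results


processed = dispatch([Packet(1), Packet([1, 2, 3])])
```

The sketch keeps the examination step separate from the two processing paths, mirroring the order of the recited steps.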

30. (Original) The method of claim 24, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; 

a special purpose register (SPR) coupled to the chip interconnect; and 

a load and store unit (LSU), the LSU being capable of executing instructions to load 
and store scalar data from and to the GPR, and the LSU being capable of 
executing instructions to load and store vector data from and to the VR. 

31. (Original) The method of claim 30, further comprising a memory location coupled to the
chip interconnect, wherein the LSU loads and stores data from and to the memory location.

32. (Original) The method of claim 31, further comprising transferring the data between the
memory location and the host memory, through a direct memory access (DMA) operation. 

33. (Original) The method of claim 24, wherein the scalar processing unit comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;

an integer shift unit (ISHU), the ISHU being capable of executing instructions to 
perform scalar bit shifting and rotating operations; and 

a floating point unit (FPU), the FPU being capable of executing instructions to 
perform high precision scalar data processing. 

34. (Original) The method of claim 24, wherein the vector processing unit comprises: 

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU being capable of executing instructions
to perform high precision vector data processing.

35. (Original) The method of claim 34, wherein the VLUT comprises a memory location 
storing at least one look-up table (LUT). 

36. (Original) The method of claim 35, further comprising transferring data of the LUT from
the host memory to the memory location, through a direct memory access (DMA) operation. 

37. (Original) The method of claim 24, wherein the scalar data processing and vector data 
processing are performed autonomously and asynchronously to the host processor. 

38. (Original) The method of claim 25, wherein the scalar processing unit and the vector 
processing unit communicate with the host processor through an interrupt mechanism. 

39. (Original) The method of claim 25, wherein the scalar processing unit and the vector 
processing unit are accessible by the host processor through a set of memory mapped
addresses. 

40. (Currently Amended) An apparatus, in an integrated circuit (IC) having a chip 
interconnect, of a data processing system having at least one host processor and a host 
memory, the apparatus comprising: 

means for receiving a data stream from an input/output (I/O) interface coupled to the
chip interconnect, the I/O interface capable of being coupled to an I/O
controller external to the IC, the I/O controller controlling I/O devices of the
data processing system external to the IC;

means for examining the data to determine whether the data requires scalar data 
processing or vector data processing; 

means for performing scalar data processing on the data in the IC, if the data requires 
scalar data processing; and 

means for performing vector data processing on the data in the IC, if the data requires 
vector data processing, wherein receiving the data stream, examining the data, 
the scalar data processing, and the vector data processing are performed within
the IC which is a single chipset interfacing the at least one host processor and
the host memory with other components of the data processing system
including the I/O controller and the I/O devices, wherein the at least one host
processor, the host memory, the I/O controller, and the I/O devices are external
to the IC, and wherein the I/O controller and the I/O devices communicate with
the host processor and the host memory via the IC.

Application Serial No. 10/038,742
Atty. Docket No. 4860P2691

41. (Original) The apparatus of claim 40, further comprising:

a scalar processing unit coupled to the chip interconnect, the scalar processing unit
performing scalar data processing on the data; and

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data.

42. (Previously Presented) The apparatus of claim 41, further comprising multiple scalar
processing units performing scalar data processing substantially simultaneously and 
multiple vector processing units performing vector data processing substantially 
simultaneously.

43. (Previously Presented) The apparatus of claim 41, further comprising:

means for dispatching the data to the scalar processing unit if the data requires scalar 
data processing; and 

means for dispatching the data to the vector processing unit if the data requires vector 
data processing.

44. (Original) The apparatus of claim 43, wherein the dispatching is performed by a switch 
mechanism coupled to the chip interconnect, the switch mechanism receiving the data 
from the I/O interface. 

45. (Original) The apparatus of claim 44, wherein the switch mechanism comprises an
instruction unit (IUNIT), the IUNIT being capable of decoding the data.

46. (Original) The apparatus of claim 40, further comprising: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; 

a special purpose register (SPR) coupled to the chip interconnect; and 

a load and store unit (LSU), the LSU being capable of executing instructions to load 
and store scalar data from and to the GPR, and the LSU being capable of 
executing instructions to load and store vector data from and to the VR. 

47. (Original) The apparatus of claim 46, further comprising a memory location coupled to 
the chip interconnect, wherein the LSU loads and stores data from and to the memory 
location. 

48. (Original) The apparatus of claim 47, further comprising means for transferring the data 
between the memory location and the host memory, through a direct memory access (DMA) 
operation. 

49. (Original) The apparatus of claim 40, wherein the scalar processing unit comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;

an integer shift unit (ISHU), the ISHU being capable of executing instructions to 
perform scalar bit shifting and rotating operations; and 

a floating point unit (FPU), the FPU being capable of executing instructions to 
perform high precision scalar data processing. 

50. (Original) The apparatus of claim 40, wherein the vector processing unit comprises:

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU being capable of executing instructions
to perform high precision vector data processing.

51. (Original) The apparatus of claim 50, wherein the VLUT comprises a memory location
storing at least one look-up table (LUT). 

52. (Original) The apparatus of claim 51, further comprising means for transferring data of 
the LUT from the host memory to the memory location, through a direct memory access 
(DMA) operation. 

53. (Original) The apparatus of claim 40, wherein the scalar data processing and vector data 
processing are performed autonomously and asynchronously to the host processor. 

54. (Original) The apparatus of claim 41, wherein the scalar processing unit and the vector
processing unit communicate with the host processor through an interrupt mechanism. 

55. (Original) The apparatus of claim 41, wherein the scalar processing unit and the vector
processing unit are accessible by the host processor through a set of memory mapped
addresses. 

56. (Currently Amended) A machine readable medium having stored thereon executable 
code which causes a machine to perform a method, in an integrated circuit (IC) having a chip 
interconnect, of a data processing system having at least one host processor and a host 
memory, the method comprising: 

receiving a data stream from an input/output (I/O) interface coupled to the chip
interconnect, the I/O interface capable of being coupled to an I/O controller
external to the IC, the I/O controller controlling I/O devices of the data
processing system external to the IC;

examining the data to determine whether the data requires scalar data processing or 
vector data processing; 

performing scalar data processing on the data in the IC, if the data requires scalar data 
processing; and 

performing vector data processing on the data in the IC, if the data requires vector data 
processing, wherein receiving the data stream, examining the data, the scalar 
data processing, and the vector data processing are performed within the IC 
which is a single chipset interfacing the at least one host processor and the host 
memory with other components of the data processing system including the I/O 
controller and the I/O devices, wherein the at least one host processor, the host 
memory, the I/O controller, and the I/O devices are external to the IC, and 
wherein the I/O controller and the I/O devices communicate with the host 
processor and the host memory via the IC.

57. (Original) The machine readable medium of claim 56, wherein the method further
comprises: 

a scalar processing unit coupled to the chip interconnect, the scalar processing unit
performing scalar data processing on the data; and

a vector processing unit coupled to the chip interconnect, the vector processing unit
performing vector data processing on the data.

58. (Previously Presented) The machine readable medium of claim 57, wherein the method 
further comprises multiple scalar processing units performing scalar data processing 
substantially simultaneously and multiple vector processing units performing vector data 
processing substantially simultaneously. 

59. (Previously Presented) The machine readable medium of claim 57, wherein the method 
further comprises: 

dispatching the data to the scalar processing unit if the data requires scalar data 
processing; and 

dispatching the data to the vector processing unit if the data requires vector data 
processing. 

60. (Original) The machine readable medium of claim 59, wherein the dispatching is 
performed by a switch mechanism coupled to the chip interconnect, the switch mechanism 
receiving the data from the I/O interface. 

61. (Original) The machine readable medium of claim 60, wherein the switch mechanism
comprises an instruction unit (IUNIT), the IUNIT being capable of decoding the data. 

62. (Original) The machine readable medium of claim 56, wherein the method further
comprises: 

a general purpose register (GPR) file coupled to the scalar processing unit; 

a vector register (VR) file coupled to the vector processing unit; 

a special purpose register (SPR) coupled to the chip interconnect; and 

a load and store unit (LSU), the LSU being capable of executing instructions to load 
and store scalar data from and to the GPR, and the LSU being capable of 
executing instructions to load and store vector data from and to the VR. 

63. (Original) The machine readable medium of claim 62, wherein the method further 
comprises a memory location coupled to the chip interconnect, wherein the LSU loads and 
stores data from and to the memory location. 

64. (Original) The machine readable medium of claim 63, wherein the method further 
comprises transferring the data between the memory location and the host memory, through a 
direct memory access (DMA) operation. 

65. (Original) The machine readable medium of claim 56, wherein the scalar processing unit 
comprises: 

an integer arithmetic and logic unit (IALU), the IALU being capable of executing
instructions to perform simple scalar integer arithmetic and logical operations;

an integer shift unit (ISHU), the ISHU being capable of executing instructions to 
perform scalar bit shifting and rotating operations; and 

a floating point unit (FPU), the FPU being capable of executing instructions to 
perform high precision scalar data processing. 

66. (Original) The machine readable medium of claim 56, wherein the vector processing
unit comprises:

a vector permute unit (VPU), the VPU being capable of executing instructions to
perform vector permute operations;

a vector simple integer unit (VSIU), the VSIU being capable of executing instructions
to perform vector simple integer arithmetic and logical operations;

a vector complex integer unit (VCIU), the VCIU being capable of executing
instructions to perform vector complex integer arithmetic operations;

a vector look-up table unit (VLUT), the VLUT being capable of executing instructions
to perform at least one vector table look-up; and

a vector floating point unit (VFPU), the VFPU being capable of executing instructions
to perform high precision vector data processing.

67. (Original) The machine readable medium of claim 66, wherein the VLUT comprises a 
memory location storing at least one look-up table (LUT). 

68. (Original) The machine readable medium of claim 67, wherein the method further 
comprises transferring data of the LUT from the host memory to the memory location, 
through a direct memory access (DMA) operation. 

69. (Original) The machine readable medium of claim 56, wherein the scalar data processing 
and vector data processing are performed autonomously and asynchronously to the host 
processor. 

70. (Original) The machine readable medium of claim 57, wherein the scalar processing unit 
and the vector processing unit communicate with the host processor through an interrupt 
mechanism. 

71. (Original) The machine readable medium of claim 57, wherein the scalar processing unit
and the vector processing unit are accessible by the host processor through a set of memory
mapped addresses.